PSEUDO-RANDOM SEQUENCES

Gerard Bruere-Dawson

Technical Memorandum 16

(This report was reproduced from an M.S. Thesis, MIT, Department of Electrical Engineering, June 1970.)

October 1970

PROJECT MAC
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139

ACKNOWLEDGMENT

The author wishes to thank Professor Albert R. Meyer for his guidance in supervising this thesis, and also for his suggestions for the general idea of this paper and many of the proofs.

Miss Marsha E. Baker deserves the credit for having done a beautiful job of typing this thesis.

* Work reported herein was supported in part by Project MAC, an M.I.T. research project sponsored by the Advanced Research Projects Agency, Department of Defense, under Office of Naval Research Contract Nonr-4102(01).

TABLE OF CONTENTS

                                          PAGE
INTRODUCTION                                5
CHAPTER I    VON MISES' DEFINITION          8
CHAPTER II   SEQUENTIAL TESTS              30
CHAPTER III  DESCRIPTIVE COMPLEXITY        40
CONCLUSION                                 52
REFERENCES                                 54

INTRODUCTION

The purpose of this paper is to study some notions of randomness for infinite sequences of 0's and 1's. We consider S = s_1 s_2 ... s_n ..., where s_i is either 0 or 1 with equal probability, this probability being independent of the values of the other elements of S. If such a sequence is obtained by choosing 0 or 1 at random, one can state properties which will be verified, with probability one, by the initial segments S_n of S. We denote by S_n = s_1 s_2 ... s_n the initial segment of length n of S = s_1 s_2 ... s_n ... . For example: the limit of the relative frequency of 1 in S_n is 1/2, as n increases indefinitely.
Given such a property or law, we can say that a sequence is random if it has this property. We shall restrict our attention to effectively testable properties, and see under what conditions one can effectively generate (i.e. with a program) sequences with some of these properties. Such sequences will be called pseudo-random.

We now consider some of those properties, in order to define random sequences.

a) The first point is that a random sequence is unpredictable. That is to say, given an initial segment of the sequence, there is no way to predict accurately what will follow. This notion was introduced by Von Mises, who defined: the probability of an event is the ultimate frequency of occurrence of this event after an infinite number of independent trials. Upon this, he built the definition of a "Kollektiv", or random sequence. We will state this notion precisely, as well as the formulations of Wald and Church, in the first chapter.

b) Second property: if a sequence is easy to describe, if it contains some kind of pattern for example, then it is not likely to be random. Kolmogorov formalized this idea and defined the descriptive complexity of finite sequences. This complexity is then used to define random infinite sequences.

c) The third property uses a completely different approach, formulated by Martin-Lof. Indeed, if we want to speak of random elements of a set S, we need to introduce a probability measure on S. Random elements are then characterized by properties verified by a subset of S of measure 1. Here we introduce a measure μ on the set of infinite binary sequences Ω = {0,1}^ω, namely the ω-product of the equiprobable measure on {0,1}. This measure is defined on the Borel sets of the topology T, with basis:

    [x_1 x_2 ... x_n] : the set of infinite sequences beginning with x_1 x_2 ... x_n,

where each x_i is either 0 or 1.
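As a small computational aside (the code and names here are illustrative and not part of the memorandum), the measure of a basis element follows directly from the definition: fixing n initial digits leaves a set of measure 2^(-n).

```python
from fractions import Fraction

def cylinder_measure(prefix):
    """mu([x_1 ... x_n]): the set of infinite binary sequences that
    begin with the given prefix.  Under the omega-product of the
    equiprobable measure on {0,1}, each fixed digit halves the
    measure, so the cylinder has measure 2**(-n)."""
    return Fraction(1, 2 ** len(prefix))

print(cylinder_measure("101"))   # 1/8
print(cylinder_measure(""))      # 1, the whole space Omega
# The 2**n cylinders of a fixed depth n partition Omega:
print(sum(cylinder_measure(format(k, "03b")) for k in range(8)))  # 1
```

Since disjoint cylinders of equal depth partition Ω, their measures sum to 1, which is all the "equiprobable ω-product" amounts to on the basis sets.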
Therefore, to say that a property is verified almost everywhere on Ω (with respect to μ) is equivalent to saying that there is an open set of arbitrarily small measure including the set of exceptions to this property. This will give us the definition of sequential tests in the second chapter.

We will study the correlations of those concepts and consider extensions of those definitions.

CHAPTER I

VON MISES' DEFINITION

We first present the concepts of Von Mises about randomness, then the related works of Wald and Church. We give three definitions of random sequences, or "Kollektivs". Those definitions will lead us to that of gambling procedures, and to some interesting areas of research, such as the study of pseudo-random sequences.

1.1 Definition of a Kollektiv

A Kollektiv is an infinite sequence S = s_1 s_2 ... s_n ... over a finite alphabet, such that:

1) Any element of the alphabet occurs in S with a certain limiting frequency.

2) If we select an infinite set of indexes (places) I = {i_1, i_2, ..., i_n, ...}, such that the decision as to whether or not i_n ∈ I depends only on the values of the other elements (s_1, s_2, ..., s_{i_n - 1}, s_{i_n + 1}, ...), then each element occurs with the same limiting frequency in the subsequence S' = s_{i_1} s_{i_2} ... s_{i_n} ... .

We will in this paper consider sequences of 0's and 1's, and the case where the limiting frequency of 0 (or 1) is 1/2.

Definition: Let X = {0,1}* = the set of finite binary sequences. A selector f is a 0-1 valued function defined on X. A place selection P_f is the mapping which associates to each infinite sequence S = s_1 s_2 ... the subsequence (which may be finite)

    P_f(S) = s_{i_1} s_{i_2} ... s_{i_n} ...

where i_1 = the least i such that f(s_1 s_2 ... s_{i-1}) = 1, and i_{n+1} = the least i, bigger than i_n, such that f(s_1 s_2 ... s_{i-1}) = 1.

Notation. Given a place selection P_f and an infinite sequence S, we write:

    P for P_f;
    P(S) for the subsequence selected by P on S;
    P(S_n) for the subsequence selected on the first n digits of S.
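The mechanics of a place selection can be sketched in a few lines (an illustrative fragment; the example selector below is an arbitrary choice, not one from the text):

```python
def select(f, s):
    """Apply the place selection P_f to a finite prefix s (a list of
    0/1 digits).  Whether digit i is selected depends only on the
    digits strictly before it: s[i] is selected iff f(s[0..i-1]) == 1."""
    return [s[i] for i in range(len(s)) if f(s[:i]) == 1]

# An example selector: select a place exactly when the preceding
# digit was a 1 (a classical "after a one" selection rule).
def after_one(prefix):
    return 1 if prefix and prefix[-1] == 1 else 0

s = [1, 0, 1, 1, 0, 0, 1, 0]
print(select(after_one, s))         # [0, 1, 0, 0]
print(select(lambda p: 1, s) == s)  # the selector 1 selects every place
```

For a Kollektiv, the selected subsequence must again have limiting frequency 1/2 of 1's, whatever (admissible) selector is used.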
    #P(S_n) for the number of elements selected;
    #_0 P(S_n) (resp. #_1 P(S_n)) for the number of 0's (resp. 1's) selected.

1.2 Definition of A. Wald [13]

Wald noticed that one ought to put certain restrictions on that definition, since: given an increasing sequence of integers {n_i}, the process which extracts from S = s_1 s_2 ... s_n ... the subsequence S' = s_{n_1} s_{n_2} ... s_{n_m} ... is a place selection. But if we consider all possible sequences {n_i}, then for any S with an infinite number of 0's or of 1's there will be one {n_i} such that s_{n_i} = 1 identically (or s_{n_i} = 0). Therefore no sequence would be a "Kollektiv". To avoid this, Wald proposed:

Definition: Consider a countable set of place selections including the one which selects every place; a sequence will be a "Kollektiv" if, for every place selection selecting an infinite number of places, the limiting frequency of the number of 0's (or 1's) is 1/2.

1.3 Definition of Church [1]

We can remark that, in order for a system of place selections to be applicable, we ought to be able to reproduce each place selection indefinitely. It is equivalent to say, by Church's thesis, that the selectors of those place selections are recursive functions on X. Formally: given a one-one recursive encoding t of X into {0, 1, 2, ...}, a place selection P_f is effective iff the function n → f(t^{-1}(n)) is a recursive function.

Definition. A sequence S will be a "Kollektiv" (in the sense of Church) iff for all effective place selections, the limiting frequency of the number of 0's (or 1's) is 1/2.

Note that there is a countable number of effective place selections, and thus this definition is consistent with that of Wald.

1.4 Definition of Gambling Algorithms

For defining random sequences we shall use an equivalent notion.

Notation: Given an infinite sequence S, S can be considered as the characteristic function of a subset E_S of {1, 2, ...}, i.e. n ∈ E_S ⟺ s_n = 1. We shall often write S for E_S and not distinguish between them.
Definitions: A gambling procedure is a total function g(x, S) of one integer and one set variable such that g(x, S) = Ψ(x, S - {x}) for some total recursive* Ψ with range in {0, 1, 2}. These may be called F (future)-algorithms, for emphasis. A P (past)-algorithm is as above, with g(x, S) = Ψ(x, S ∩ {0, 1, ..., x-1}). The output 2 is interpreted as "no guess".

* Cf. H. Rogers, Theory of Recursive Functions and Effective Computability, Chapter 15.

Definition:

    γ(g, S, x) = |{y ≤ x | g(y, S) ≠ 2}| = the number of guesses up to x made by g about S;
    k(g, S, x) = |{y ≤ x | g(y, S) = s_y}| = the number of good guesses up to x;
    α(g, S, x) = k(g, S, x) / γ(g, S, x).

We call α(g, S, x) the accuracy of g on S at x, and write α(g, S) = lim_{x→∞} α(g, S, x), if such a limit exists.

Definition: A procedure g is applicable to S iff lim_{x→∞} γ(g, S, x) = +∞. A procedure g has a bias for S iff lim sup_{x→∞} α(g, S, x) > 1/2.

Having defined gambling procedures, we shall show the equivalence, within our subject, of place selections and P-algorithms. Given a P-algorithm P, we define: P_0, the place selection which selects s_n whenever P(n, S) = 0, and does not select otherwise; P_1 does the same thing for P(n, S) = 1.

Theorem: Given a sequence S, if there exists a P-algorithm P, applicable to S, such that either

    1) α(P, S, x) has no limit when x → +∞, or
    2) lim_{x→∞} α(P, S, x) ≠ 1/2,

then there exists a place selection Q (namely P_0 or P_1) such that either

    lim sup_{n→∞} #_0 Q(S_n) / #Q(S_n) > 1/2    or    lim sup_{n→∞} #_1 Q(S_n) / #Q(S_n) > 1/2.

Proof: The first two conditions are equivalent to: either

    a) lim sup_{x→∞} α(P, S, x) > 1/2, or
    b) lim inf_{x→∞} α(P, S, x) < 1/2.

In the second case we may use the P-algorithm P' defined by P'(S, x) = 1 - P(S, x) if P(S, x) ≠ 2, and P'(S, x) = 2 otherwise; then lim sup α(P', S, x) > 1/2. So we can restrict our attention to case a). That is:

    (∃ ε > 0) [α(P, S, x) > 1/2 + ε  (i.o.)].

Considering now P_0 and P_1, we have

    γ(P, S, n) = #P_0(S_n) + #P_1(S_n),
    k(P, S, n) = #_0 P_0(S_n) + #_1 P_1(S_n).
Let

    α_0(n) = #_0 P_0(S_n) / #P_0(S_n),    α_1(n) = #_1 P_1(S_n) / #P_1(S_n),

so that

    α(P, S, n) = [α_0(n) #P_0(S_n) + α_1(n) #P_1(S_n)] / [#P_0(S_n) + #P_1(S_n)].

In order to prove the theorem, we have to consider two cases.

Case 1: P guesses a finite number N of, say, 0's. Then for the (infinitely many) large n with α(P, S, n) > 1/2 + ε, since the number of good guesses is at most N + α_1(n) #P_1(S_n) and the number of guesses is at least #P_1(S_n),

    α_1(n) > 1/2 + ε - N / #P_1(S_n),

and as N / #P_1(S_n) can be made arbitrarily small,

    (∃ ε' > 0) [α_1(n) > 1/2 + ε'  (i.o.)].

Case 2: P guesses an infinite number of 0's and of 1's. Since α(P, S, n) is a weighted average of α_0(n) and α_1(n), α(P, S, n) > 1/2 + ε implies α_0(n) > 1/2 + ε or α_1(n) > 1/2 + ε. Then, since α(P, S, n) > 1/2 + ε infinitely often, one of α_0 or α_1 is infinitely often bigger than 1/2 + ε.   Q.E.D.

Let us now consider random sequences.

Definition (Church): A sequence S is random iff for every P-algorithm P applicable to S,

    lim_{x→∞} α(P, S, x) = 1/2.

Notation: We shall write "random (VM)" for random in the sense of Von Mises (version of Church).

1.4.1 Discussion

The first obvious remark is that a periodic or ultimately periodic sequence like S = 001001001... will not be a random sequence. Moreover, there is no way to predict accurately a random (VM) sequence. We have the important property, using the measure μ defined in the introduction over Ω = {0,1}^ω:

Theorem (Church): μ{S | S is random (VM)} = 1. That is, a sequence S is random almost surely.

In order to prove this theorem we shall use another theorem (the optional sampling theorem).

Theorem (Doob) [14]: Given a place selection P such that P(S) is an infinite sequence almost surely, let P(S) = S' = s'_1 s'_2 ... s'_n ... . Then P : S → P(S) is a measure preserving transformation. That is, if A is a Borel set on Ω, then P^{-1}(A) is also a Borel set, and μ(A) = μ[P^{-1}(A)].

Proof (of Church's theorem): Consider the set V of sequences S such that

    (∃ P applicable to S) ¬[lim_{x→∞} α(P, S, x) = 1/2].    (1)

We shall prove that V has measure 0.
We know, by what is above, that (1) is equivalent to:

    (∃ P applicable to S) lim sup_{x→∞} α(P, S, x) > 1/2.

Then V = ∪_P ∪_m Γ_{P,m}, where

    Γ_{P,m} = {S | (∃ infinitely many x) α(P, S, x) > 1/2 + 1/m}.

Since there is only a countable number of P's, it suffices to show (∀P)(∀m)[μ(Γ_{P,m}) = 0] to obtain μ(V) = 0.

Assume, for some P and m, that μ(Γ_{P,m}) > 0. From P we extract P_0 and P_1 (see the theorem above), and let

    V_0 = {S ∈ Γ_{P,m} | the subsequence P_0(S) has a frequency of 0's > 1/2 + 1/2m infinitely often},
    V_1 = {S ∈ Γ_{P,m} | the subsequence P_1(S) has a frequency of 1's > 1/2 + 1/2m infinitely often},

so that Γ_{P,m} = V_0 ∪ V_1. Assume μ(V_0) > 0, and let

    V'_0 = {S | the frequency of 0's in S_n is > 1/2 + 1/2m infinitely often}.

Then (∀ S ∈ V_0) P_0(S) ∈ V'_0, i.e. V_0 ⊆ P_0^{-1}(V'_0). But by the strong law of large numbers μ(V'_0) = 0, and by Doob's theorem μ[P_0^{-1}(V'_0)] = μ(V'_0) = 0, so μ(V_0) = 0, a contradiction. The same argument applies to V_1; therefore μ(Γ_{P,m}) = 0.   Q.E.D.

Remark: In the proof of this theorem we use only the fact that there is a countable number of place selections. The theorem would also be true for a larger definition of random sequences, e.g. using r.e. gambling procedures, where g(x, S) divergent is interpreted as a "no guess" answer.

We know that the characteristic function of a recursive set is not random (VM), since this characteristic function is recursive. This is also true for the characteristic function of a recursively enumerable set (or its complement).

Fact: The characteristic function of an r.e. set is not random (VM).

Proof: If the set is infinite, we use a recursive function enumerating it without repetition to construct a guessing algorithm; if it is finite, its characteristic function is recursive.

Remark: By the same method, if the set or its complement includes an infinite r.e. set, then it is not random (VM).

This leads us to the question: how non-recursive does a sequence S have to be in order to be random? We shall use the Kleene hierarchy of sets. The Kleene hierarchy classifies the "arithmetical" sets into classes Σ_n and Π_n, n = 0, 1, 2, ...,
defined as follows. Σ_n is the class of all sets A of the form

    A = {(a_1, ..., a_m) | (Q_1 x_1)(Q_2 x_2) ... (Q_n x_n) P(a_1, ..., a_m, x_1, ..., x_n)},

where P(a_1, ..., a_m, x_1, ..., x_n) is a recursive predicate, the Q_{2k+1} are existential quantifiers, and the Q_{2k} are universal quantifiers. Π_n is the class of all sets A as above, except that the Q_{2k+1} are universal quantifiers and the Q_{2k} are existential quantifiers.

Summarizing some basic properties of this hierarchy of sets (see Rogers):

    (1) Σ_0 = Π_0 = Σ_1 ∩ Π_1 = the collection of all recursive sets;
    (2) A ∈ Σ_n ⟺ Ā ∈ Π_n, for all n;
    (3) Σ_n ⊆ Σ_{n+1}, Π_n ⊆ Σ_{n+1}, Σ_n ⊆ Π_{n+1}, and Π_n ⊆ Π_{n+1}, for all n;
    (4) Σ_n ⊄ Π_n and Π_n ⊄ Σ_n for n > 0, and Σ_n ∪ Π_n ⊆ Σ_{n+1} ∩ Π_{n+1}, the containment being proper.

Then we have:

Theorem (Loveland [6]): There exist recursively random sets properly in Δ_i = Σ_i ∩ Π_i, for i = 2, 3, ...

1.4.2 Critique of this Definition

From the point of view of gambling, it may be that a particular sequence is guessed accurately by some P-algorithm, yet there is no effective way to find this good gambling procedure. The notion of "accurate guessing" is also somewhat artificial: we may have, for a sequence S, a procedure which is accurate infinitely often but very inaccurate most of the time. For a random sequence and a given P-algorithm, it may also happen that the accuracy tends to 1/2 from above. This does not correspond to the intuitive idea of a random sequence. More precisely, we state a theorem proved by J. Ville [12] (in a slightly different form).

Theorem (Ville [12]): There exists a sequence S, random in the sense of Church, such that in every initial segment of S the number of 0's is not greater than the number of 1's.

1.5 Areas of Research

1) We can use a wider class of gambling procedures. Kruse [15] studied this extension in its set-theoretic aspect.
Given a set and a probability measure on this set, we can define arbitrarily random elements, as long as we keep the condition that almost surely any element is random. From a more restricted point of view, we may consider gambling procedures in the arithmetical hierarchy.

2) We will in 1.5.1 present some remarks about the general F-procedures defined above.

3) One cannot effectively construct random sequences in the sense of Von Mises. But if we impose a bound on the time necessary to guess elements of a sequence, we can define and (effectively) construct sequences which will look random to all "fast" gambling procedures. We investigate this in 1.5.2.

1.5.1 F-Algorithms

Recall that an F-algorithm is a function F(x, S) such that F(x, S) = Ψ(x, S - {x}), where Ψ is a general recursive function of one integer and one set variable, ranging in {0, 1, 2}.

Our motivation for considering F-algorithms can be explained as follows. D. Loveland [5] made the remark that there are selection rules, practically applicable, which are not place selections. On this basis he built an extension of the Von Mises theory of random sequences. Our extension of P-algorithms to F-algorithms follows the same pattern.

A P-algorithm, applied to a sequence S, is analogous to a process of extrapolation. For example, if we consider S_n = s_1 s_2 ... s_n as the history of S up to time n, we want to predict what will occur next. On the other hand, an F-algorithm is analogous to a process of interpolation: given a sequence S, we want to predict the value of s_x, and for this purpose we can ask a finite number of questions about other elements of S.

In the next section, we compare F- and P-algorithms. Our results are inconclusive; we state a few results and remarks. The first point is that Doob's theorem (the optional sampling theorem) no longer applies to extended place selections.
Definition: An F-place selection is a place selection, with the difference that the selector f is a function of a set variable and an integer,

    f(S, x) = Ψ(S - {x}, x),

where Ψ is general recursive.

We have:

Fact: There exist F-place selections which modify the distribution.

Define a selector f by: f(S, 1) = 0 if s_2 = 1, f(S, 1) = 1 otherwise; and f(S, n) = 1 for n = 2, 3, ... . We see that with probability 3/4 the first element selected will be 1 (it is s_2 = 1 when place 1 is skipped, and s_1 otherwise).

However, it is certainly possible to define random sequences with respect to F-algorithms, as in Section 1.4. This class of random sequences is certainly included in the class of random (VM) sequences. In some special cases, we can decide whether the containment is proper.

Theorem: If we consider gambling procedures with outputs in {0,1}, that is, guessing at each argument, then there are sequences which are random with respect to P-algorithms but not with respect to F-algorithms.

Proof: We know already that there exists a sequence S such that

    (∀ P ∈ 𝒫) [α(P, S) = 1/2],    (1)

where 𝒫 is the set of P-algorithms. Given such a sequence S = s_1 s_2 ... s_n ..., define S' = s'_1 s'_2 ... s'_n ... by:

    s'_x = s'_y  if (∃ y < x)(y + 2^y = x);    s'_x = s_x  otherwise.

An F-algorithm may thus, at argument y, ask the value of s'_{y + 2^y} and give this value as output; it will always be right, so S' is not random with respect to F-algorithms.

Assume now that there exists a P-algorithm P with a bias on S', i.e.

    α(P, S', x) = k(P, S', x) / γ(P, S', x) > 1/2 + 1/p  (i.o.), for some p ∈ N.

With our restriction (a guess at every argument), γ(P, S', x) = x, so

    α(P, S', x) = k(P, S', x) / x > 1/2 + 1/p  (i.o.).

Let R = range[f], where f(y) = y + 2^y. Then (∀ x ∉ R)[s_x = s'_x], and there are fewer than log x elements of R smaller than x, since y + 2^y ≤ x ⟹ 2^y < x ⟹ y < log x. So if we apply the same algorithm to S, it has

    α(P, S, x) ≥ [k(P, S', x) - log x] / x > 1/2 + 1/p - (log x)/x > 1/2 + 1/2p  (i.o.),

which contradicts equation (1). Therefore no P-algorithm has a bias on S', and S' is the required sequence.   Q.E.D.
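The Fact above, that a selector allowed to look ahead can shift the distribution, can be checked exhaustively over the four equally likely values of (s_1, s_2). This is an illustrative sketch; the encoding of the selector as a Python function is an assumption consistent with the claimed probability 3/4:

```python
def first_selected(s1, s2):
    """The F-selector of the Fact: place 1 is selected iff s_2 = 0
    (the selector is allowed to look at a later digit), and every
    place n >= 2 is selected.  Return the first digit selected."""
    return s1 if s2 == 0 else s2

# Exhaustive check over the four equally likely pairs (s_1, s_2):
outcomes = [first_selected(s1, s2) for s1 in (0, 1) for s2 in (0, 1)]
print(outcomes)                       # [0, 1, 1, 1]
print(sum(outcomes) / len(outcomes))  # 0.75: the first selected digit
                                      # is 1 with probability 3/4
```

For an ordinary (past-only) place selection, Doob's theorem would force the first selected digit to be 0 or 1 with probability 1/2 each; the look-ahead is exactly what breaks this.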
To conclude this section, we give, without proof, a theorem of Loveland, stated in [5] in a different and more general form:

Theorem (Loveland): There exist a random (VM) sequence S and an F-algorithm F such that α(F, S) = 1.

1.5.2 Pseudo-Random Sequences

A recursive sequence of 0's and 1's is obviously not random in the sense of Church. Nevertheless, we would like to construct sequences which would be very difficult to predict. To measure this difficulty, we will use the complexity theory introduced by Blum. Let P.R. be the set of partial recursive functions, R being the set of recursive functions.

Definition: For a Godel numbering {φ_i} of P.R., a sequence Φ_0, Φ_1, ..., Φ_n, ... of functions in P.R. is a measure of complexity iff:

    1) dom Φ_i = dom φ_i;
    2) λ(i, x, y)[Φ_i(x) = y] is a recursive predicate.

Definition: Given a function g ∈ R, f in R is g-computable iff (∃ φ_i = f)[Φ_i(x) ≤ g(x) except at a finite number of points].

We now want to introduce a measure of the complexity of a P-algorithm P(S, x), which can be considered as a function recursive in the set S:

    P(S, x) = φ_i^S(x)  for some i.

We define, for a function φ_i^S(x), a relativized measure Φ_i^S as a function recursively enumerable in S, with the properties:

    1) (∀ S) [domain(Φ_i^S) = domain(φ_i^S)];
    2) λ(i, x, y)[Φ_i^S(x) = y] is a predicate recursive in S.

The complexity of P(S, x) is noted π(S, x). π(S, x) may represent the number of steps necessary for a universal Turing machine, with an oracle for S, to compute P(S, x).

Definition: Given g in R, a P-algorithm P is g-computable on S if π(S, x) ≤ g(x) for all but a finite number of x's.

Definition: Given g in R, an infinite sequence S is random at level g if every P-algorithm g-computable on S guesses S with a limit accuracy 1/2.

Our aim is to generate recursive sequences random at level g. Since such a sequence S is recursive, (∃ i) φ_i(x) = s_x. Obviously, if S is random at level g, then the function f(x) = s_x is not g-computable.
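The two axioms of a measure of complexity can be illustrated by the most familiar example, a step count. This is a toy sketch only; the generator-based "programs" are an illustrative device, not the thesis's formalism:

```python
def steps(prog, x, max_steps=10_000):
    """A toy Blum measure Phi: the number of 'steps' (yields) that the
    program prog performs on input x before halting.  Axiom 1: steps()
    is defined exactly where prog halts.  Axiom 2: the predicate
    'Phi_prog(x) = y' is decidable, since one can run prog for y steps
    and observe whether it halts precisely then."""
    g = prog(x)
    for n in range(max_steps):
        try:
            next(g)
        except StopIteration:
            return n          # prog halted after n steps
    return None               # no verdict within the budget

def double(x):
    """A 'program' written as a generator: one yield per step."""
    for _ in range(x):
        yield
    return 2 * x

print(steps(double, 5))   # 5
print(steps(double, 0))   # 0
```

With this measure, "f is g-computable" says that some program for f runs within g(x) steps for all but finitely many x, which is the reading used throughout this section.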
The converse is not true, as was proved by A. Meyer and E. McCreight:

Theorem (Meyer): (∀ g ∈ R)(∀ d ∈ R)(∃ 0-1 valued C ∈ R)(∀ i)

    [φ_i = C ⟹ Φ_i(x) > g(x) a.e.]  and  (∀ x)[C(x) = 1 ⟹ (∀ y)(x < y ≤ x + d(x) ⟹ C(y) = 0)].

Informally, this means that there exist recursive sequences, difficult to compute, with an arbitrarily small frequency of 1's.

In order to construct a pseudo-random sequence, we could use an extension of a procedure by Levin, Minsky, and Silver. This procedure, given a countable set of P-algorithms, yields a sequence which is random with respect to this set, and recursive in an enumerating function of that set. We present a different construction, which yields a recursive sequence random at level g and which allows us to control the variations of accuracy.

Theorem: For any g ∈ R, and any positive computable function ε(x) with lim_{x→∞} ε(x) = 0 and lim_{x→∞} x ε(x)² = +∞, one can construct a sequence S such that:

    1) S is random at level g.
    2) If we let
        γ'(P, S, x) = |{y ≤ x | P(S, y) ≠ 2 and π(S, y) ≤ g(y)}|,
        k'(P, S, x) = |{y ≤ x | P(S, y) = s_y and π(S, y) ≤ g(y)}|,
        α'(P, S, x) = k'(P, S, x) / γ'(P, S, x),
    then [1/2 - ε(y) < α'(P, S, x) < 1/2 + ε(y), with y = γ'(P, S, x)]  (a.e.).

Proof: We use a diagonal argument developed by McCreight. By definition, a P-algorithm P(S, x) can be written as φ_i^S(x) = P(S, x) for some i. To each φ_i^S(x) we associate the measure Φ_i^S(x), as above. Consider the list {φ_i}. This is not an enumeration of the P-algorithms, but all P-algorithms are in this list. Assign to each element in the list an initial weight w_0(i) = 1/(i+1)². This weight may be changed at some stages. We define S in stages; at stage x, S_{x-1} is already defined, and we compute s_x.

Stage x:

Define I = {i ≤ x | Φ_i^{S_{x-1}}(x) ≤* g(x)}, where Φ_i(x) ≤* g(x) means that the computation of φ_i^{S_{x-1}}(x) did not use the oracle at any y ≥ x, and that Φ_i^{S_{x-1}}(x) ≤ g(x).
Let

    A(x) = {i ∈ I | φ_i(x) = 0},    B(x) = {i ∈ I | φ_i(x) = 1},
    w_a = Σ_{i ∈ A(x)} w(i),        w_b = Σ_{i ∈ B(x)} w(i),

and let θ(x) = ε(x). There are two cases:

    a) If w_a > w_b: 1) multiply all weights in A(x) by 1 - θ(x) and all weights in B(x) by 1 + θ(x); 2) set s_x = 1.

    b) If w_a ≤ w_b: 1) multiply all weights in A(x) by 1 + θ(x) and all weights in B(x) by 1 - θ(x); 2) set s_x = 0.

END.

(In either case the heavier side is shrunk, so the total weight never increases.)

To show that this algorithm gives the desired result, we define, for a P-algorithm P and its complexity π, the property

    Q(n): (∃ x)[γ'(P, S, x) = n  and  α'(P, S, x) > 1/2 + ε(n)].

Consider a P-algorithm P(S, x) = φ_i^S(x), and assume it has property Q(n). Then at stage x,

    w(i) = C_x w_0(i),    C_x = Π_{j=1}^{m} (1 - θ(x_j)) · Π_{j=1}^{p} (1 + θ(y_j)),

where m + p = n, p/(m+p) > 1/2 + ε(n), {x_j} are the points z with P(S, z) ≠ s_z, and {y_j} are the points z with P(S, z) = s_z. One then obtains a bound of the form

    C_x ≥ (1 - ε(n))^{(1/2 - ε(n))n} (1 + ε(n))^{(1/2 + ε(n))n},

which, since n ε(n)² → +∞, tends to +∞ when n → +∞. As x ≥ n, if P(S, x) has property Q infinitely often, then C_x can be made arbitrarily large. But initially Σ_{i=1}^{∞} w_0(i) < +∞, and at no stage can this sum increase; therefore the sum, and hence each w(i), remains bounded. Thus any P-algorithm has property Q at most a finite number of times. We have thereby proved:

    α'(P, S, x) < 1/2 + ε(y)  (a.e.),  with y = γ'(P, S, x).

Assume now α'(P, S, x) ≤ 1/2 - ε(y)  (i.o.), y = γ'(P, S, x). Then from P we can deduce another P-algorithm P' (reverse every guess), with the same complexity on S, such that α'(P', S, x) ≥ 1/2 + ε(y)  (i.o.), y = γ'(P', S, x), which is a contradiction. Moreover, if P is g-computable on S, then α(P, S) = lim α'(P, S, x) = 1/2, so S is random at level g.   Q.E.D.

CHAPTER II

SEQUENTIAL TESTS

2.1 Purposes and Definitions

As we saw in the preceding chapter, the definition of random sequences using a countable number of place selections (or, equivalently, gambling procedures) presents some inconsistencies with the intuitive notion of randomness; see, for example, the remark of Ville. So let us look back at probability theory.
Consider a random sequence as illustrated by an indefinite repetition of independent events, with a finite number of possible outcomes (e.g. 0 or 1). Extending the notion of probability for a finite number of possible events, we defined in the introduction a probability measure μ over the set of infinite sequences of 0's and 1's, Ω = {0,1}^ω. We characterize random elements in this set by the properties they have almost surely. So, with Martin-Lof [8], we will say, informally:

Definition: A sequence ω ∈ Ω is random if it satisfies all "almost surely" type theorems of probability theory.

For example, a sequence random in the sense of Wald will obey the strong law of large numbers (which is of first order), but not necessarily the law of the iterated logarithm (which is of second order). In order to make this notion precise, we look at what we call "almost surely" type theorems. An "almost surely" type theorem is a property verified by a subset of measure 1 of Ω. Examples:

Strong law of large numbers: The limiting frequency of the number of 1's in a sequence ω ∈ Ω is almost surely 1/2.

Law of the iterated logarithm: Let σ_n(ω) be the sum of the n first digits of the sequence ω. Then, almost surely,

    lim sup_{n→∞} (2σ_n(ω) - n) / sqrt(2n log log n) = +1    (resp. lim inf = -1).

For such a theorem, then, we can say that the set of all sequences violating the law has measure zero. By definition this means that to every ε > 0 there exists an open set 𝒰 covering this set such that μ(𝒰) < ε.

For x ∈ {0,1}*, let [x] denote the set of all infinite sequences beginning with x. Then, instead of the open set 𝒰, we may consider the set U = {x | [x] ⊆ 𝒰} ⊆ {0,1}*. Note that, conversely, 𝒰 = ∪_{x ∈ U} [x] if and only if 𝒰 is open. We say that 𝒰 is generated by U. Further, U has the property that it contains all possible extensions of any of its elements (sequentiality), y being an extension of x (in symbols y ⊐ x) if the string y begins with x. In other words, U may be regarded as the critical region of a sequential test on the level ε.
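The statistic appearing in the law of the iterated logarithm is easy to compute for a sampled sequence. The following simulation is purely illustrative and not part of the memorandum:

```python
import math
import random

def lil_statistic(bits):
    """The normalized deviation in the law of the iterated logarithm:
    (2*sigma_n - n) / sqrt(2 n log log n), where sigma_n is the number
    of 1's among the first n digits.  Almost surely its lim sup over n
    is +1 and its lim inf is -1."""
    n = len(bits)
    return (2 * sum(bits) - n) / math.sqrt(2 * n * math.log(math.log(n)))

random.seed(1)
bits = [random.randint(0, 1) for _ in range(100_000)]
# For a typical sample the statistic at a single n is of order 1:
print(abs(lil_statistic(bits)) < 5)   # True
```

A sequence whose partial sums stay on one side of n/2 forever (such as the one given by Ville's theorem) keeps this statistic nonnegative, so its lim inf cannot be -1; this is used in Section 2.3.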
The definition of a null set may hence be stated in statistical terms as follows: for every ε > 0, there exists a sequential test on that level which rejects all sequences of the set. In order to be able to construct these tests effectively, we impose some further restrictions; that is, for a given sequential test:

    1) the U's are recursively enumerable;
    2) given m, one can effectively enumerate a U_m such that μ[∪_{x ∈ U_m} [x]] ≤ 2^{-m}.

Definition 2.1.1. Let X = {0,1}*. A subset U of N × X is a sequential test if:

    1) U is recursively enumerable;
    2) its sections U_m = {x ∈ X | (m, x) ∈ U} are sequential, that is, (∀ x ∈ U_m)(∀ y ∈ X)[xy ∈ U_m];
    3) we have X ⊇ U_1 ⊇ U_2 ⊇ U_3 ⊇ ...;
    4) and 𝒰_m, the open set generated by U_m, verifies μ(𝒰_m) ≤ 2^{-m}.

From each sequential test U in N × X, one obtains immediately a sequential test (𝒰_1, 𝒰_2, ...) in Ω. The relation between 𝒰_m and U_m is reversible; we shall treat them as synonyms. All effective tests from probability theory on the space Ω (with μ) are sequential tests in this sense.

The main property of this definition is that we can define a test which includes all tests of the type defined above.

Definition 2.1.2. A sequential test U is called universal if there exists, for each sequential test V, an integer c such that V_{m+c} ⊆ U_m (m = 1, 2, ...).

2.2 Properties

Theorem (Martin-Lof) 2.2.1: There exist universal sequential tests.

Sketch of the proof: We define a Godel numbering of the set of all sequential tests, U^(0), U^(1), ..., U^(m), ..., and we define the universal test U by its sections:

    U_m = ∪_{i=0}^{∞} U^(i)_{m+i+1}.

Definition 2.2.2. A sequence is random in this definition (we shall write random (ML)) if it does not belong to ∩_{m≥1} 𝒰_m, for U universal.
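A concrete instance of Definition 2.1.1 may help; the test below is an illustrative example of my own, not one from the text: reject, at level m, every string beginning with m zeros.

```python
def in_U(m, x):
    """A concrete sequential test: U_m contains every finite string
    whose first m digits are all 0.  Checking Definition 2.1.1:
    membership is decidable, so U is r.e.; each U_m is sequential
    (every extension of a member is a member); U_{m+1} is a subset
    of U_m; and the open set generated by U_m is the single cylinder
    [0^m], of measure 2**(-m)."""
    return len(x) >= m and all(c == "0" for c in x[:m])

print(in_U(3, "000"))       # True
print(in_U(3, "0001011"))   # True: extensions stay inside
print(in_U(3, "0010"))      # False
print(in_U(4, "0001011"))   # False: U_4 is contained in U_3
```

The only sequence rejected at every level is 000...; it lies in every 𝒰_m of this test, hence in the universal null set, and so is not random (ML), as one expects.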
We shall first study the position of some random (ML) sequences in the arithmetical hierarchy, then relate this definition to the notion of Kollektiv. Theorem 2.2.3 : The characteric function of a recursively enumerable set (or its complement) is not random (ML). Proof ; Let A be this set and h a recursive function enumerating A without repetitions. We assume A infinite. Let h = (n first elements of A enumerated by h}. n < U n = (s ! h n cs} We have 1) u[4| ] £ 2 -n , since n coordinates are fixed 2) the corresponding U n is sequential and 3) u n+1 c U n for all n 4) Since h is recursive, U n is also recursive. Then if A or its complement is r.e., its characteristic function C A will not be random (ML). 35 Corollary : C A is random (ML) implies that A and its complement intersect every infinite r.e. set. So there is no random (ML) sequences, in v^ and tt^ We show that there is one in A 2 = ^2 n "2 ^ in faCt A 2 " ^1 U "l^ Theorem 2.2.4 . There exists A 2 sequences which are random (ML). Proof ; we construct a characteristic function recursive in a recursively enumerable set. The sequence representing this characteristic function will be random (ML). Let U be a universal sequentail test, x and y binary strings. We know that uOllj^ <: 1 Define C as follows : x e C « (3 n) (V y) [length (y) = n * X y g t^] Note x g U x =» x g C Since U, is recursively enumerable V = {x|(v y) [length(y) -nsxypD.] is also r.e. x f C « (3 n) [x f V n J ' So C is recursively enumerable. Moreover \J.(fU j) «£ J im P lies that at least half of the sequences of length n are not in C. Using an oracle for C, we now construct a sequence S, in stages. stage Assume ^ C, Let S l . - stage n+1 Assume S n ^ C, then S n 1 or S n doesn't belong to C Q . If S 1 / C then S^ +1 = S n l otherwise S n+1 - S n 0. END 36 No initial segment, of the sequence S so constructed will be in C, and therefore cannot be in Uj soS^, S is random (ML) Q.E.D. 
Using the same method, we could prove the following theorem, given without proof: Theorem 2.2.5 . There exists random (ML) sets properly in fi. = ^. H n t for 1*2,3,... Let us now investigate the relation between this definition of randomness and that of Church. Let VM = [Si there exists a P-algorithm with a bias for S}. VM is the set of sequences not random in the sense of Church. Theorem 2.2.6 . If a sequence is random (ML), then it is random (VM) Proof ; Given a P-algorithm P, we extract the two corresponding places selections: P Q , ? v as in chapter I. We define the set of finite strings (for m, n integers): A q, ° = [S* f X US initial segment of S ' , such that: n ' ' 1) # P Q (S) - n. q 2) #0 P (S) /#P (S)> £ + 1) ^ Let Ck. ' be the open set of Q generated by A n 1) We have a„ q - u - 1 j r m«n m 37 where r-» <1> _ ( S | xhe first m places selected by P Q contains a number m of O's > ( \ + k >•) From Doob's theorem, if S„ is the initial segment of length a of S: n p rr*q>°) = p fs|S contains more than m ( T + jr ) O's) r ' m r ' m z A Let X - 7 +->>)>> 7 2 q We have q,0 n» „ p(r } * 2^". r; ( ) r m P*** P If we consider the entropy function for 2 <A ^ 1; H(A) => logr + (l->) log ^ We have p(f q,0 } 2 "»2-HOA) r m 9 m (H(A)-l) £ 2 Since /X> ?=* 1_H <A> = a > 0. »° l ^ 9 "ft™ r • m r » m So r .... _ • m ___ ^.2 a m^n m" 38 P,0 2 -an n+1 lerefore P (*JL_ ) < ~ * l r n i n a, The 1-2" We have also q,0 q,0 2) A J _. a A (V n) n+1 n q,0 q,0 3) x f A n =» (w y) [xy € A n ]. q,0 4) a is r.e. n Let q,o q,0 U = A . where fx] is the smallest Integer greater n 1+n L a J than x. Then U q, ° is a sequential test in the sense of Martin- Lof. n q,i We could, in the same manner, construct U n corresponding to T> ^ and some q. Now if a sequence S is not random (VM) , (3 P, P-algorithm) (3 q) [ a [P,S,x] ■> 2 +J (i.o)] which implies q*,i (^ q') (V n) [S cUl n ] where i is either or 1 , and hence S is not random (ML). Q.E.D. 39 2.3. 
2.3.

As a conclusion, we can say that some random (VM) sequences are not random (ML). This was affirmed by Martin-Löf in a private communication; here is a straightforward proof. The law of the iterated logarithm, as given above, may be stated as a sequential test, so all sequences which do not obey this law lie in the null set of that test; therefore a sequence which does not obey the law of the iterated logarithm is not random (ML). But by a theorem of Ville [12], one can construct a random (VM) sequence in which the ratio of the number of 1's to the number of 0's is, in every initial segment, never less than 1. This sequence does not follow the law of the iterated logarithm, and therefore is not random (ML).

CHAPTER III

DESCRIPTIVE COMPLEXITY

After this survey of gambling algorithms and measure-theoretic arguments, we shall take another point of view, related to the difficulty of describing a random sequence. Note that if the binary strings of length n are classified according to the number of 0's they contain, the strings in the largest class have $\lfloor n/2 \rfloor$ zeros. Thus a string drawn at random will, with great probability, have approximately the same number of 0's and 1's. By the same kind of combinatorial argument, a string of length n chosen at random has only a small probability of containing some kind of periodic pattern.

From these simple remarks we can deduce another definition of randomness. One characteristic of a binary string made entirely of 1's (or of 0's), or of a repeating pattern, is its easy description: it is sufficient to have the length of the string and the pattern in order to reconstruct it. Kolmogorov, in [10], formalized this idea. He gave a new definition of the relative entropy H(x|y), x being a binary string. This entropy measures the difficulty of describing the string x, given the information y (e.g., its length). This will enable us to define random elements as those with the largest entropy.
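The combinatorial remark about the largest class can be verified directly; a small sketch:

```python
import math

# The binary strings of length n, classified by their number k of zeros,
# form classes of size C(n, k); the largest class sits at k = floor(n/2).
for n in (9, 10, 16):
    sizes = [math.comb(n, k) for k in range(n + 1)]
    assert max(sizes) == math.comb(n, n // 2)

# Most strings are near-balanced: for n = 16, over 90% of all 2**16
# strings have between n/2 - sqrt(n) and n/2 + sqrt(n) zeros.
n = 16
near = sum(math.comb(n, k) for k in range(4, 13))   # 4 <= k <= 12 zeros
assert near / 2**n > 0.9
```

For odd n the maximum is shared by $\lfloor n/2 \rfloor$ and $\lceil n/2 \rceil$, which does not affect the argument.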
But we have to make clear that a description of some element x makes sense only if we make precise the means of reconstructing x from it. We consider, in this chapter, a description of a binary string to be a "program" for a specific Turing machine (with binary strings as outputs). Note that we restrict our attention to effective reconstruction; one might as well consider more powerful means (using Turing machines with an oracle, for example).

Given a Turing machine A, we say that p is a program for x if A(p) = x, and write $\ell(p)$ for its length.

Definition: The Kolmogorov complexity (or descriptive complexity) of a string x with respect to the algorithm A is

    $K_A(x) = \min_{A(p)=x} \ell(p)$

if there exists a string p such that A(p) = x, and $K_A(x) = \infty$ otherwise. The Kolmogorov conditional complexity of x given y, with respect to a two-input machine A, is

    $K_A(x \mid y) = \min_{A(p,y)=x} \ell(p)$

if such a p exists, and $K_A(x \mid y) = \infty$ otherwise.

There are several basic properties, noted by Kolmogorov [9], which apply to both measures.

Fact 1: There exists a universal algorithm B such that for an arbitrary algorithm A and all x,

    $K_B(x \mid \ell(x)) \le K_A(x \mid \ell(x)) + c$,

where c depends only on A and B.

Fact 2: If $B_1$ and $B_2$ are two universal algorithms, then there exists a constant c such that for all x,

    $|K_{B_1}(x \mid \ell(x)) - K_{B_2}(x \mid \ell(x))| \le c$.

So two universal algorithms are equivalent up to a constant. We shall hereafter write $K(x \mid \ell(x))$ for $K_U(x \mid \ell(x))$, U being some fixed universal algorithm.

Fact 3: There exists a constant c such that $K(x \mid \ell(x)) \le \ell(x) + c$ for all x.

Fact 4: Fewer than $2^r$ strings of length n satisfy $K(x \mid n) < r$.

3.1 Definition of Random Sequences

Consequently, we are led to the definition: a finite string x is random if $K(x \mid \ell(x))$ is maximal among strings of the same length. We want to generalize this definition to infinite sequences of 0's and 1's. First idea: a sequence will be random if all its initial segments are random.
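Before pursuing the infinite case, note that Fact 4, on which the finite definition rests, is a pure counting argument; a minimal sketch:

```python
# Fact 4 by counting: the binary programs of length < r number
# 2**0 + 2**1 + ... + 2**(r-1) = 2**r - 1 (including the empty program),
# and each program describes at most one string; so fewer than 2**r
# strings of any given length n can satisfy K(x|n) < r.
for r in range(1, 12):
    n_programs = sum(2**k for k in range(r))   # programs of length 0..r-1
    assert n_programs == 2**r - 1 < 2**r

# Taking r = n: fewer than 2**n of the 2**n strings of length n are
# compressible below n bits, so some string x has K(x|n) >= n.
```

Together with Fact 3, this shows that for every n there exist strings of maximal complexity, i.e. random finite strings.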
And since the complexity is defined only up to an additive constant, we would then have:

    S is random $\iff (\exists c)(\forall n)[K(S_n \mid n) > n - c]$.

But such sequences, as shown by Martin-Löf, do not exist. Indeed, from probability theory we know that a random sequence almost surely contains arbitrarily long runs of 0's or of 1's, and it is clear that the description of such segments can be substantially simplified. More precisely:

Theorem (Martin-Löf [17]): Let f be a real-valued function. If $\sum_{n=1}^{\infty} 2^{-f(n)} = +\infty$, then for all $S \in \Omega$, $K(S_n \mid n) < n - f(n)$ for infinitely many n.

Therefore, if we want a definition consistent with the fact that almost all sequences are random, we can take:

Definition I (Chaitin [8]): The set $C_\infty$ of patternless, or random, infinite binary sequences is

    $C_\infty = \{S \mid K(S_n \mid n) > n - f(n)$ for all but finitely many $n\}$,

where $f(n) = 3 \log n$, or any function such that $\sum_{n=1}^{\infty} 2^{-f(n)} < +\infty$. We will verify below that $\mu(C_\infty) = 1$.

Definition II (Martin-Löf): The set R of random infinite binary sequences is

    $R = \{S \mid (\exists c)[K(S_n \mid n) > n - c$ for infinitely many $n]\}$.

We will also show that $\mu(R) = 1$, and that $R \subseteq C_\infty$.

3.2 Properties

We use the second definition, and write "random (K)" for "random in the sense of Definition II".

Theorem 3.2.1: $\mu\{S \mid S$ is random (K)$\} = 1$.

Proof: Let $\bar{K}$ be the complement of R in $\Omega$:

    $S \in \bar{K} \iff (\forall c)(\exists n_0)(\forall n \ge n_0)[K(S_n \mid n) \le n - c]$.

Let $T_{n,c} = \{S \mid K(S_n \mid n) \le n - c\}$. Then

    $\bar{K} = \bigcap_c \bigcup_{n_0} \bigcap_{n \ge n_0} T_{n,c}$.

By Fact 4, $\mu(T_{n,c}) \le 2^{n-c}/2^n = 2^{-c}$, so for each c

    $\mu\left(\bigcup_{n_0} \bigcap_{n \ge n_0} T_{n,c}\right) \le \liminf_n \mu(T_{n,c}) \le 2^{-c}$,

and hence $\mu(\bar{K}) \le 2^{-c}$ for every c, i.e. $\mu(\bar{K}) = 0$. Q.E.D.

We prove now that Definition I also gives a set of measure 1.

Theorem 3.2.2: Let f(n) be a function such that $\sum_{n=1}^{\infty} 2^{-f(n)} < +\infty$. Then for almost all S, $K(S_n \mid n) > n - f(n)$ for all but finitely many n.

Proof: Let

    $V = \{S \mid K(S_n \mid n) \le n - f(n)$ for infinitely many $n\} = \bigcap_{n_0} \bigcup_{n \ge n_0} E_n$, where $E_n = \{S \mid K(S_n \mid n) \le n - f(n)\}$.

By Fact 4, $\mu(E_n) \le 2^{n-f(n)}/2^n = 2^{-f(n)}$, and $\sum 2^{-f(n)} < +\infty$ implies, by the Borel-Cantelli lemma, that $\mu(V) = 0$. Q.E.D.
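The convergence condition separating Theorem 3.2.2 from Martin-Löf's negative result can be checked numerically for Chaitin's choice $f(n) = 3 \log n$ (base-2 logarithm assumed here); a small sketch:

```python
import math

# With f(n) = 3*log2(n), 2**(-f(n)) = n**(-3) and the series converges
# (to Apery's constant, zeta(3) ~ 1.2021), so Borel-Cantelli applies.
s = sum(2.0 ** (-3 * math.log2(n)) for n in range(1, 10001))
assert abs(s - sum(n ** -3.0 for n in range(1, 10001))) < 1e-9
assert 1.202 < s < 1.2021

# With constant f(n) = c the terms are 2**(-c) and the partial sums grow
# without bound -- which is why no sequence can have K(S_n|n) > n - c
# for every n, and Definition I cannot use a constant f.
c = 3
partial = sum(2.0 ** -c for n in range(1, 10001))
assert partial > 1000
```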
To prove that a random (K) sequence is also random in the sense of Definition I, we use:

Theorem 3.2.3 (Martin-Löf): Let f be a function such that $\sum_{n=1}^{\infty} 2^{-f(n)} < +\infty$ and such that, given m, one can compute effectively an N with $\sum_{n=N}^{\infty} 2^{-f(n)} \le 2^{-m}$. Then for every random (K) sequence, $K(S_n \mid n) > n - f(n)$ for all but finitely many n.

The proof uses f(n) to construct a sequential test, and then appeals to Theorem 3.2.4 below.

Theorem 3.2.4: If S is random (K), then S is random (ML).

Proof: Assume S is not random (ML); then, for the universal sequential test U,

    $S \in \bigcap_{m=1}^{\infty} O_m$,

which means

    $(\forall m)(\exists n_0)(\forall n \ge n_0)[S_n \in U_m]$.   (1)

Now: 1) U is recursively enumerable; 2) $U_m$ contains fewer than $2^{n-m}$ strings of length n (since $\mu(O_m) \le 2^{-m}$).

We can describe an algorithm A generating $S_n$. Assume $S_n \in U_m$. A is given a program enumerating U, the number m, and a number $z < 2^{n-m}$. Given such data, A enumerates the strings of length n in $U_m$ and outputs the z-th of them. The description used by A is therefore of size

    constant $+ \log m + (n - m)$.

By (1), $(\forall m)(\forall n \ge n_0(m))[K_A(S_n \mid n) \le n - m + \log m + c]$. By Facts 1 and 2,

    $K(S_n \mid n) \le K_A(S_n \mid n) + c'$,

so $(\forall m)(\forall n \ge n_0(m))[K(S_n \mid n) \le n - m + \log m + c'']$, where $c''$ is a constant independent of S and n. Thus for every c, $K(S_n \mid n) < n - c$ for all but finitely many n, and S is not random (K). Q.E.D.

3.3 Definition of $\bar{K}$ as the Null Set of a Test

Let $X = \{0,1\}^*$. For each m define

    $T_m = \{x \in X \mid (\forall y \in X)[K(xy \mid \ell(xy)) \le \ell(xy) - m]\}$.

Then:

1) $T_m$ is sequential: $x \in T_m \Rightarrow (\forall y \in X)[xy \in T_m]$;
2) $T_1 \supseteq T_2 \supseteq \ldots \supseteq T_m \supseteq \ldots$;
3) the open subset $\bar{O}_m$ of $\Omega$ generated by $T_m$ satisfies $\mu(\bar{O}_m) \le 2^{-m}$.

Let $\bar{C} = \bigcap_m \bar{O}_m$; $\bar{C}$ is a null set.

Remark: The $T_m$ above define a sequential test in the sense of Martin-Löf except for recursive enumerability. What, then, is the degree of uneffectiveness of $T_m$?

Theorem 3.3.1: $T_m$ is $\Pi_2$.

Proof: Let $V_m = \{x \mid K(x \mid \ell(x)) \le \ell(x) - m\}$; $V_m$ is a recursively enumerable set. Then

    $x \in T_m \iff (\forall y)[xy \in V_m]$,

a universal quantifier over an r.e. predicate, so $T_m$ (and likewise $\bar{C}$) is $\Pi_2$. Q.E.D.

Fact: Using Definition II, one sees that S is random (K) $\iff S \notin \bigcap_m \bar{O}_m$. We have therefore characterized $\bar{K}$ as the null set of a test in $\Pi_2$.
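The compression step in the proof of Theorem 3.2.4 — describing a string by its rank in an enumerable set — can be sketched concretely. The names below are illustrative only; `candidates` stands in for the enumeration of the length-n strings of $U_m$:

```python
# If S_n is known to lie in an enumerable set of fewer than 2**(n-m)
# strings of length n, its rank z in a canonical enumeration is a
# description of S_n using only n-m bits (plus O(log m) side data);
# the decoder re-runs the enumeration and takes the z-th element.
def encode(x, candidates, n, m):
    z = sorted(candidates).index(x)    # rank of x in the enumeration
    return format(z, "b").zfill(n - m) # (n-m)-bit description of x

def decode(code, candidates):
    return sorted(candidates)[int(code, 2)]

n, m = 8, 5
candidates = {"00000000", "00001111", "11110000"}  # fewer than 2**(n-m) = 8
code = encode("00001111", candidates, n, m)
assert len(code) == n - m              # 3-bit description of an 8-bit string
assert decode(code, candidates) == "00001111"
```

The m bits saved per string are exactly what drives the complexity of $S_n$ below $n - c$ for every c in the proof.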
Using the same method as in Chapter II, one can show:

Theorem 3.3.2: There are characteristic functions of sets in $\Delta_3$ which are random (K).

It is an open problem whether $T_m$ is properly in $\Pi_2$; equivalently, since we know by Theorem 3.2.4 that the null set of the universal Martin-Löf test is included in $\bar{K}$, whether this inclusion is proper. By the preceding results, this notion of randomness is the strongest of the three, in the sense that its null set of non-random sequences includes the other two.

3.4 Pseudo-Random Sequences

Given a Turing machine A, we defined the complexity of a string x as the minimal length of a program from which A generates x. We can, as in Chapter I, put a bound on the time necessary for A to generate x. We shall present in this section a few remarks and a theorem of McCreight [2], which may lead to interesting research.

Given a Turing machine A with two inputs x, y, we define a measure $\alpha(x,y)$ of the computation of A on inputs x, y, such that:

1) $\alpha(x,y)$ is finite $\iff$ the computation A(x,y) halts;
2) $\alpha(x,y) = z$ is a recursive predicate.

We can take as an example the number of steps of the computation A(x,y).

Definition 3.4.1: Given a recursive function g, the pseudo-complexity at level g of a string $x = x_1 x_2 \ldots x_n$ is

    $K'_g(x \mid n) = \min \ell(p)$ over programs p with $A(p,n) = x$,

where A is a universal algorithm and $(\forall i \le n)[A(p,i) = x_1 x_2 \ldots x_i$ and $\alpha(p,i) \le g(i)]$.

This definition is related to that of uniform complexity by Loveland [5]. We give a tentative

Definition: Given a recursive function g, a sequence S is random at level g if $K'_g(S_n \mid n) > n - f(n)$ for all but finitely many n, where f(n) will be fixed below.

Rabin gives a construction of a 0-1 valued recursive function such that every program $\varphi_i$ computing it satisfies $\Phi_i(x) > g(x)$ almost everywhere. But this property alone does not make the sequence $S = \varphi_i(1)\,\varphi_i(2) \ldots \varphi_i(n) \ldots$ random at level g. Assume indeed that S has the property that $K'_g(S_n \mid n) \le n - f(n)$ for infinitely many n. Then, for arbitrarily large n, the string $S_n = s_1 s_2 \ldots s_n$ can be described by a program p with $\ell(p) \le n - f(n)$. We can construct a new program for S of the form $\alpha\,\gamma\,\beta\,\gamma\,p$,
where $\gamma$ is some binary string which serves to separate the three parts $\alpha$, $\beta$, p, and $\alpha$ is the binary encoding of the following instructions:

"Given input x, test whether $x - f(x) \le$ the length of the string to the right of the second $\gamma$. If so, simulate A(p, x) and give the x-th digit of the output as result. Otherwise compute $s_x$ according to the instructions given by $\beta$."

Let i be the Gödel number of this program; then

    $\log i \le k + n - f(n)$,

where k is a constant. Let $h(n) = n - f(n)$. The program $\varphi_i$ computes the sequence S and has the property:

    $\Phi_i(x) \le g(x)$ on at least $h^{-1}(\log i - k)$ inputs.

Let us now state a theorem of McCreight [2]:

Theorem: $(\forall \varepsilon > 0)(\forall g \in R_1)(\exists k \in N)(\exists$ a 0-1 valued $c \in R_1)(\forall i \in N)$

    $[\varphi_i = c \Rightarrow \Phi_i(x) > g(x)$ for all but $k + (1+\varepsilon)\log i$ values of $x]$,

and moreover this c can be found effectively from g and $\varepsilon$.

Now let $f(n) = \varepsilon' n$ and construct the function c of the theorem above with $\varepsilon < \varepsilon'$. If the sequence $c(1), c(2), \ldots, c(n), \ldots$ were not random at level g, then by the construction above there would exist arbitrarily large i such that $\varphi_i = c$ and $\Phi_i \le g$ on at least $h^{-1}(\log i - k) = (\log i - k)/(1 - \varepsilon')$ inputs, which is ultimately larger than $k + (1+\varepsilon)\log i$, contradicting the theorem. The sequence c is therefore random at level g if we take $f(n) = \varepsilon' n$. We have:

Theorem: Given g and $\varepsilon > 0$, one can effectively construct sequences S such that $K'_g(S_n \mid n) > (1-\varepsilon)n$ for all but finitely many n.

Remark: It remains an open question whether the bound in McCreight's theorem can be decreased to $k + \log i$. If so, we would be able to construct sequences such that, for some constant c, $K'_g(S_n \mid n) > n - c$ for all but finitely many n, which would be a definition of pseudo-random sequences consistent with the Kolmogorov definition.

CONCLUSION

Each of these three definitions of randomness is in fact based upon some set of statistical tests. We considered in this paper only effective tests, i.e., tests that can be carried out by computer programs. We could expand this theory to a more complete form by considering tests arbitrarily high in the arithmetical hierarchy.
If we let 0 denote the recursive T-degree and $0^{(n)}$ its n-th jump, we can define an n-guessing algorithm as a total function g(x, S) of one integer and one set variable,

    $g(x, S) = \Psi(x, S \cap \{1, 2, \ldots, x-1\})$,

where $\Psi$ is recursive in $0^{(n)}$. A sequence S will be random (VMn) if no n-guessing algorithm has a bias for it. Note that Ville's result applies again: there are random (VMn) sequences which do not follow the law of the iterated logarithm.

The definitions of Chapters 2 and 3 can also be generalized: taking sequential tests to be subsets of $N \times X$ recursively enumerable in $0^{(n)}$ gives the definition of random (MLn) sequences, and generalizing the Kolmogorov complexity to Turing machines with an oracle for $0^{(n)}$ defines random (Kn) sequences. Each of these extended definitions yields a set of random sequences of measure 1. The inclusion results also hold for these definitions relativized to $0^{(n)}$: a random (Kn) sequence is random (MLn) and, in turn, a random (MLn) sequence is random (VMn). (For the jump operation, see Rogers, Theory of Recursive Functions, Chapter 13.)

One other aspect of this paper which could lead to interesting developments is the generation of pseudo-random sequences. Using a measure of the complexity of a test, we constructed sequences which look random to all sufficiently simple randomness tests. Such sequences could be useful for the generation of random numbers with any distribution, with applications in various fields such as simulation or coding theory.

REFERENCES

[1] A. Church: On the Concept of a Random Sequence. Bulletin of the American Mathematical Society 46 (1940), pp. 130-135.

[2] E. McCreight: A Note on Complex Recursive Characteristic Functions. Unpublished.

[3] E. McCreight and A. R. Meyer: Classes of Computable Functions Defined by Bounds on Computation. ACM Symposium on Theory of Computing (1969).

[4] Levin, Minsky and Silver: On the Problem of the Effective Definition of "Random Sequence". Memo 36 (revised), M.I.T. Computation Center.
[5] D. Loveland: A New Interpretation of the von Mises Concept of Random Sequence. Zeitschrift für mathematische Logik und Grundlagen der Mathematik 12 (1966).

[6] D. Loveland: The Kleene Hierarchy Classification of Recursively Random Sequences. Transactions of the American Mathematical Society 125 (pp. 497-519).

[7] P. Martin-Löf: The Definition of Random Sequences. Information and Control 9 (1966).

[8] P. Martin-Löf: Algorithms and Randomness. European Meeting of Statisticians, London (1966).

[9] A. N. Kolmogorov: Three Approaches to the Quantitative Definition of Information. Problemy Peredachi Informatsii 1 (1965). (Translated from the Russian.)

[10] A. N. Kolmogorov: Logical Basis for Information Theory and Probability Theory. IEEE Transactions on Information Theory, Vol. IT-14, No. 5 (1968).

[11] D. E. Knuth: The Art of Computer Programming, Vol. 2: Seminumerical Algorithms. Addison-Wesley.

[12] J. Ville: Étude critique de la notion de collectif. Gauthier-Villars, Paris (1939).

[13] A. Wald: Sur la notion de collectif dans le calcul des probabilités. Comptes Rendus de l'Académie des Sciences 202 (1936), pp. 180-183.

[14] J. L. Doob: Note on Probability. Annals of Mathematics (2) 37 (1936).

[15] A. H. Kruse: Some Notions of Random Sequence and Their Set-Theoretic Foundations. Zeitschrift für mathematische Logik und Grundlagen der Mathematik (1967).

UNCLASSIFIED
Security Classification

DOCUMENT CONTROL DATA - R&D

1. Originating activity: Massachusetts Institute of Technology, Project MAC
2a. Report security classification: UNCLASSIFIED
2b. Group: None
3. Report title: Pseudo-Random Sequences
4. Descriptive notes: Technical Memorandum
5. Author: Bruere-Dawson, Gerard
6. Report date: October 1970
7a. Total no. of pages: 54
7b. No. of refs: 15
8a. Contract or grant no.: Nonr-4102(01)
9a. Originator's report number: TM-16
10. Availability/limitation notices: Distribution of this document is unlimited.
11. Supplementary notes: None
12. Sponsoring military activity: Advanced Research Projects Agency, 3D-200 Pentagon, Washington, D.C. 20301
13. Abstract: Three definitions of random binary sequences are presented. The consistency of these definitions with the laws of probability theory, and the inclusion relationships among the three sets of random sequences, are investigated. These sequences, considered as characteristic functions of sets, are then placed in the Kleene arithmetical hierarchy. Some restrictions on these definitions, using Blum's complexity theory, lead to the definition of pseudo-random sequences, which can be generated effectively.
14. Key words: Random Sequences; Sequential Tests; Descriptive Complexity; Recursive Functions; Church; Probability Laws; Kleene Hierarchy

DD Form 1473 (M.I.T.)
UNCLASSIFIED